

destroR: Attacking Transfer Models with Obfuscous Examples to Discard Perplexity

Ahmed, Saadat Rafid, Shareen, Rubayet, Sharkar, Radoan, Hossain, Nazia, Mahi, Mansur, Sadeque, Farig Yousuf

arXiv.org Artificial Intelligence

Advancements in machine learning and neural networks in recent years have led to widespread, remarkably successful implementations of Natural Language Processing across a variety of fields, solving a wide range of complicated problems. However, recent research has shown that machine learning models may be vulnerable in a number of ways, putting both the models and the systems they're used in at risk. In this paper, we analyze and experiment with the best existing adversarial attack recipes and create new ones. We concentrate on developing a novel adversarial attack strategy against current state-of-the-art machine learning models: producing ambiguous inputs that confound the models, thereby charting a path toward future improvements in model robustness. We develop adversarial instances of maximum perplexity, utilizing machine learning and deep learning approaches to trick the models. In our attack recipe, we analyze several datasets and focus on creating obfuscous adversarial examples that put the models in a state of perplexity, and we extend the field of adversarial attacks to the Bangla language. We strictly uphold reduced resource usage and efficiency throughout our work.
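The abstract does not give a concrete algorithm, but the core idea of "adversarial instances with maximum perplexity" can be sketched as a greedy word-substitution search that keeps the edit raising a perplexity score the most. Everything here is an illustrative assumption: the function names, the greedy strategy, and the toy scorer (a real attack would query an actual language model for perplexity).

```python
import math

def toy_perplexity(tokens):
    # Stand-in for a real LM perplexity score; here, longer (rarer-looking)
    # words score higher. In practice this would query a language model.
    return math.exp(sum(len(t) for t in tokens) / len(tokens))

def greedy_perplexity_attack(tokens, substitutions, perplexity, max_edits=2):
    """Greedily replace words with near-synonyms to maximize perplexity.

    tokens        : list of words in the input sentence
    substitutions : dict mapping a word to candidate replacements
    perplexity    : callable scoring a token list (higher = more confusing)
    """
    tokens = list(tokens)
    for _ in range(max_edits):
        base = perplexity(tokens)
        best_gain, best_edit = 0.0, None
        for i, tok in enumerate(tokens):
            for cand in substitutions.get(tok, ()):
                trial = tokens[:i] + [cand] + tokens[i + 1:]
                gain = perplexity(trial) - base
                if gain > best_gain:
                    best_gain, best_edit = gain, (i, cand)
        if best_edit is None:
            break  # no remaining substitution raises perplexity
        i, cand = best_edit
        tokens[i] = cand
    return tokens

subs = {"good": ["salutary"], "film": ["cinematograph"]}
adv = greedy_perplexity_attack(["a", "good", "film"], subs, toy_perplexity)
```

A semantic-similarity constraint would normally be added so the perturbed text keeps the original meaning; it is omitted here for brevity.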


Appendix for: Data-Aware Low-Rank Compression for Large NLP Models (A: Proof of Theorem 1)

Neural Information Processing Systems

In addition, a pre-defined search grid is also necessary. With these input parameters, we first distribute the total allowed loss across the individual modules. There is indeed a trade-off between efficiency and efficacy, since accuracy drops as the speedup ratio goes higher; thus, in real applications, users need to decide which trade-off suits them best. We could have chosen another cutoff, such as 1% accuracy loss at a lower speedup ratio, but this would not help much when comparing different baseline methods. D.1 LSTM result: A 2-layer LSTM model is composed of two large matrix layers and one large softmax layer. Thus, although each matrix is much smaller and well approximated by DRONE, the overall acceleration on GPU is smaller.
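The budget-allocation and rank-truncation logic described above might be sketched as follows. The proportional-to-norm allocation rule and both function names are assumptions, since the appendix excerpt does not specify them; the truncation step is standard SVD rank selection under a Frobenius-error budget.

```python
import numpy as np

def allocate_budget(module_norms, total_budget):
    # Distribute the total allowed approximation loss across modules,
    # proportionally to each module's Frobenius norm (one plausible
    # heuristic; the exact allocation rule is not given in the excerpt).
    total = sum(module_norms)
    return [total_budget * n / total for n in module_norms]

def truncated_rank(W, err_budget):
    # Smallest rank whose SVD truncation error (Frobenius norm of the
    # discarded singular values) stays within the module's budget.
    s = np.linalg.svd(W, compute_uv=False)
    # err[k] = Frobenius error when keeping the top-k singular values
    err = np.sqrt(np.concatenate([np.cumsum((s ** 2)[::-1])[::-1], [0.0]]))
    for k, e in enumerate(err):
        if e <= err_budget:
            return k
    return len(s)
```

For example, a module whose matrix has singular values (3, 2, 1) needs rank 2 to keep the truncation error within a budget of 1.0.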




Utilizing Modern Large Language Models (LLM) for Financial Trend Analysis and Digest Creation

Lazarev, Andrei, Sedov, Dmitrii

arXiv.org Artificial Intelligence

The exponential growth of information presents a significant challenge for researchers and professionals seeking to remain at the forefront of their fields. This paper introduces an innovative framework for automatically generating insightful financial digests using the power of Large Language Models (LLMs), specifically Google's Gemini Pro. By combining data extraction from OpenAlex, strategic prompt engineering, and LLM-driven analysis, we demonstrate the automated creation of comprehensive digests that summarize key findings and identify emerging trends. This approach addresses the limitations of traditional analysis methods, enabling efficient processing of vast amounts of unstructured data and the delivery of actionable insights in an easily digestible format. This paper describes in simple terms how LLMs work and how their power can help researchers and scholars save time and stay informed about current trends. Our study covers the step-by-step process, from data acquisition and JSON construction to interaction with Gemini and the automated generation of PDF reports, and includes a link to the project's GitHub repository for broader accessibility and further development.
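The JSON-construction and prompt-engineering steps of such a pipeline could look roughly like this. The record fields, prompt wording, and function name are illustrative assumptions, not the paper's; the OpenAlex fetch and the Gemini API call are omitted.

```python
import json

def build_digest_prompt(works):
    # "works" holds OpenAlex-style records (title + abstract). The real
    # pipeline would fetch these from the OpenAlex API and send the prompt
    # to Gemini; both network calls are left out of this sketch.
    payload = json.dumps(
        [{"title": w["title"], "abstract": w["abstract"]} for w in works],
        indent=2,
    )
    return (
        "You are a financial research analyst. From the papers below, "
        "summarize key findings and identify emerging trends as a concise "
        "digest:\n" + payload
    )

prompt = build_digest_prompt(
    [{"title": "Inflation and equity returns",
      "abstract": "We study how inflation shocks move equity returns."}]
)
```

The returned string would then be passed to the LLM, and the model's response rendered into a PDF report.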






Adversarial Text Generation with Dynamic Contextual Perturbation

Waghela, Hetvi, Sen, Jaydip, Rakshit, Sneha, Dasgupta, Subhasis

arXiv.org Artificial Intelligence

Adversarial attacks on Natural Language Processing (NLP) models expose vulnerabilities by introducing subtle perturbations to input text, often leading to misclassification while maintaining human readability. Existing methods typically focus on word-level or local text segment alterations, overlooking the broader context, which results in detectable or semantically inconsistent perturbations. We propose a novel adversarial text attack scheme named Dynamic Contextual Perturbation (DCP). DCP dynamically generates context-aware perturbations across sentences, paragraphs, and documents, ensuring semantic fidelity and fluency. Leveraging the capabilities of pre-trained language models, DCP iteratively refines perturbations through an adversarial objective function that balances the dual objectives of inducing model misclassification and preserving the naturalness of the text. This comprehensive approach allows DCP to produce more sophisticated and effective adversarial examples that better mimic natural language patterns. Our experiments, conducted on various NLP models and datasets, demonstrate the efficacy of DCP in challenging the robustness of state-of-the-art NLP systems. By integrating dynamic contextual analysis, DCP significantly enhances the subtlety and impact of adversarial attacks. This study highlights the critical role of context in adversarial attacks and lays the groundwork for creating more robust NLP systems capable of withstanding sophisticated adversarial strategies.
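The dual objective described above (misclassification pressure balanced against naturalness) can be illustrated with a single greedy refinement step. The scoring functions below are toy stand-ins, and the exact DCP formulation, with its context-aware candidate generation and refinement across sentence, paragraph, and document levels, is the paper's own.

```python
def dcp_step(tokens, candidates, victim_loss, naturalness, lam=0.5):
    """One refinement step of a perturbation search (illustrative sketch).

    Picks the single-token replacement maximizing an adversarial objective:
        victim_loss(x') - lam * (1 - naturalness(x'))
    keeping the original text if no candidate improves on it.
    """
    def objective(toks):
        return victim_loss(toks) - lam * (1.0 - naturalness(toks))

    best_obj, best = objective(tokens), tokens
    for i, tok in enumerate(tokens):
        for cand in candidates.get(tok, ()):
            trial = tokens[:i] + [cand] + tokens[i + 1:]
            obj = objective(trial)
            if obj > best_obj:
                best_obj, best = obj, trial
    return best

# Toy stand-ins; a real attack would query the victim NLP model and a
# pre-trained LM for naturalness scores.
def toy_victim_loss(toks):
    return 1.0 if "pleasant" in toks else 0.0  # "pleasant" flips the victim

def toy_naturalness(toks):
    return 0.9 if "pleasant" in toks else 1.0  # slightly less natural

adv = dcp_step(["a", "nice", "day"], {"nice": ["pleasant"]},
               toy_victim_loss, toy_naturalness)
```

With `lam = 0.5`, the gain in victim loss (1.0) outweighs the naturalness penalty (0.05), so the substitution is accepted.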


Quantum Graph Transformer for NLP Sentiment Classification

Aktar, Shamminuj, Bärtschi, Andreas, Badawy, Abdel-Hameed A., Eidenbenz, Stephan

arXiv.org Artificial Intelligence

Quantum machine learning is a promising direction for building more efficient and expressive models, particularly in domains where understanding complex, structured data is critical. We present the Quantum Graph Transformer (QGT), a hybrid graph-based architecture that integrates a quantum self-attention mechanism into the message-passing framework for structured language modeling. The attention mechanism is implemented using parameterized quantum circuits (PQCs), which enable the model to capture rich contextual relationships while significantly reducing the number of trainable parameters compared to classical attention mechanisms. We evaluate QGT on five sentiment classification benchmarks. Experimental results show that QGT consistently achieves accuracy higher than or comparable to existing quantum natural language processing (QNLP) models, including both attention-based and non-attention-based approaches. When compared with an equivalent classical graph transformer, QGT yields an average accuracy improvement of 5.42% on real-world datasets and 4.76% on synthetic datasets. Additionally, QGT demonstrates improved sample efficiency, requiring nearly 50% fewer labeled samples to reach comparable performance on the Yelp dataset. These results highlight the potential of graph-based QNLP techniques for advancing efficient and scalable language understanding.
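The idea of a PQC-based attention score inside message passing can be illustrated with a deliberately tiny, classically simulated stand-in: a one-parameter single-qubit circuit whose Z-expectation serves as an attention score, followed by a softmax-weighted aggregation of neighbor features. The single-qubit simplification and all names here are illustrative assumptions; the actual QGT uses multi-qubit PQCs.

```python
import numpy as np

def ry(theta):
    # Single-qubit RY rotation gate.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def pqc_score(theta):
    # <Z> expectation of RY(theta)|0>: a one-parameter "quantum" attention
    # score in [-1, 1], simulated classically via the statevector.
    state = ry(theta) @ np.array([1.0, 0.0])
    z = np.diag([1.0, -1.0])
    return float(state @ z @ state)

def quantum_attention_aggregate(node_feats, thetas):
    # One message-passing step: a softmax over PQC scores gives attention
    # weights, and the node aggregates its neighbors' features accordingly.
    scores = np.array([pqc_score(t) for t in thetas])
    w = np.exp(scores) / np.exp(scores).sum()
    return w @ node_feats
```

Here the trainable parameters are just the rotation angles, one per edge, which hints at why PQC attention can use far fewer parameters than a classical query/key projection.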